Parallel high utility pattern mining algorithm based on cluster partition
XING Shuning, LIU Fang'ai, ZHAO Xiaohui
Journal of Computer Applications, 2016, 36(8): 2202-2206. DOI: 10.11772/j.issn.1001-9081.2016.08.2202
Existing algorithms build many utility pattern trees in memory when mining high utility patterns from large-scale databases, which consumes a large amount of memory and causes some high utility itemsets to be lost. To address this, a parallel high utility pattern mining algorithm based on cluster partition, named PUCP, was proposed on the Hadoop platform. Firstly, a clustering method was used to divide the transaction database into several sub-datasets. Secondly, the sub-datasets were allocated to the Hadoop nodes, each of which constructed a utility pattern tree. Finally, the conditional pattern bases of the same item generated from the utility pattern trees were allocated to the same node, reducing the number of cross-node operations. Theoretical analysis and experimental results show that, compared with the mainstream serial high utility pattern mining algorithm UP-Growth (Utility Pattern Growth) and the parallel high utility pattern mining algorithm HUI-Growth (Parallel mining High Utility Itemsets by pattern-Growth), PUCP improves mining efficiency by 61.2% and 16.6% respectively without affecting the reliability of the mining results, and that using the Hadoop platform effectively relieves the memory pressure of mining large data sets.
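The final step of the abstract, sending all conditional pattern bases of one item to the same node, can be illustrated with a minimal Python sketch. This is not the PUCP implementation: the toy transactions, the `unit_profit` table, the alphabetical item order, the byte-sum routing function, and all function names are assumptions made for illustration, and plain Python lists stand in for Hadoop nodes. The sketch also shows the usual definition of itemset utility (quantity times unit profit, summed over the transactions that contain the whole itemset).

```python
from collections import defaultdict

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum the utility of `itemset` over every transaction containing all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def node_for_item(item, num_nodes):
    """Map an item to a node index deterministically (assumed routing rule)."""
    return sum(item.encode("utf-8")) % num_nodes


def route_conditional_bases(partitions, num_nodes):
    """Emit (item, prefix) records per sub-dataset and group them by item's node.

    The prefix of an item is the tuple of items that precede it in a fixed
    global order (alphabetical here, as an assumption); it is a simplified
    stand-in for a conditional pattern base taken from a utility pattern tree.
    """
    nodes = defaultdict(list)
    for part in partitions:
        for t in part:
            ordered = sorted(t)
            for pos, item in enumerate(ordered):
                prefix = tuple(ordered[:pos])
                nodes[node_for_item(item, num_nodes)].append((item, prefix))
    return nodes


# A simple alternating split stands in for the clustering-based partition.
partitions = [transactions[0::2], transactions[1::2]]
nodes = route_conditional_bases(partitions, num_nodes=2)

for node_id in sorted(nodes):
    print(node_id, nodes[node_id])
print("utility of {a, b}:", itemset_utility(("a", "b"), transactions, unit_profit))
```

Because the routing key depends only on the item, records for one item that originate in different sub-datasets end up on the same node, so that node can mine the item's patterns without fetching data from other nodes.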